ollama 0.1.31 -> 0.1.33 #309330

nsbuitrago · 2024-05-05T17:39:42Z

Description of changes

release notes: https://github.com/ollama/ollama/releases/tag/v0.1.33
synchronize ollama patches with ollama/llm/patches in 0.1.33
removed the postPatch replacement of g++ with $CXX, for compatibility with ollama/llm/generate/gen_common.sh in 0.1.33


drupol

The commit log message is wrong, which is why the CI is not triggering the tests.

https://github.com/ollama/ollama/releases/tag/v0.1.33

abysssol · 2024-05-05T23:26:03Z

I tested this (by running nix-build -A ollama.passthru.tests in the nixpkgs root), and 0.1.33 has the same problem as 0.1.32 (see #304823) in that it breaks the rocm compilation. It seems that everything builds right, until the end when a check in the build script finds that the compiled artifact doesn't have any dependencies on the rocm libraries. I did test what happens if I disable the check and run the compiled ollama binary anyway, but it just crashes almost instantly after running it.

So, I'm against merging this into the current nixpkgs-unstable, since this is a breaking change that removes support for rocm, and breaking changes are restricted since the next stable release will be soon (#303286).

I may be alright with merging this into the next unstable after nixos-24.05 has released, but until then I think it would be a bad idea to merge a breaking change.

abysssol

Also, since it's known that rocm is broken, meta.broken should be updated accordingly, unless/until the rocm build can be fixed.

   meta = {
     description = "Get up and running with large language models locally";
     homepage = "https://github.com/ollama/ollama";
     changelog = "https://github.com/ollama/ollama/releases/tag/v${version}";
     license = licenses.mit;
     platforms = platforms.unix;
+    broken = enableRocm;
     mainProgram = "ollama";
     maintainers = with maintainers; [ abysssol dit7ya elohmeier ];
   };

onny · 2024-05-10T10:35:31Z

ollama 0.1.34 released

onny · 2024-05-10T11:01:56Z

If possible please move the package derivation to pkgs/by-name

redyf · 2024-05-11T13:17:50Z

ollama 0.1.36 has been released
https://github.com/ollama/ollama/releases/tag/v0.1.36

ejiektpobehuk · 2024-05-12T12:18:21Z

ollama 0.1.37 has been released

nsbuitrago · 2024-05-13T02:32:19Z

> I tested this (by running nix-build -A ollama.passthru.tests in the nixpkgs root), and 0.1.33 has the same problem as 0.1.32 (see #304823) in that it breaks the rocm compilation. It seems that everything builds right, until the end when a check in the build script finds that the compiled artifact doesn't have any dependencies on the rocm libraries. I did test what happens if I disable the check and run the compiled ollama binary anyway, but it just crashes almost instantly after running it.

As for the missing dependencies, I was able to get past this by adding the rocm-core package to rocmPath. There is a collision between rocmPackages-clr and clr from rocm-core, so I removed it for now to test.

  rocmPath = buildEnv {
    name = "rocm-path";
    paths = [
      rocmPackages.hipblas
      rocmPackages.rocblas
      rocmPackages.rocsolver
      rocmPackages.rocsparse
      rocmPackages.rocm-device-libs
      rocmClang
      rocmPackages.rocm-core
    ];
  };

I also had to set the CLBlast_DIR path and append some paths to CMAKE_PREFIX_PATH at the pre-build step for this to work for me.

  preBuild = ''
    # disable uses of `git`, since nix removes the git directory
    export OLLAMA_SKIP_PATCHING=true
    # build llama.cpp libraries for ollama
    # set CLBlast_DIR and append ROCM path to CMAKE_PREFIX_PATH
    ${lib.optionalString enableRocm ''
      export CLBlast_DIR="${clblast}/lib/cmake/CLBlast"
      export CMAKE_PREFIX_PATH="${rocmPath}:${rocmPackages.rocm-comgr}/lib/cmake:$CMAKE_PREFIX_PATH"
    ''}
    go generate ${lib.optionalString enableRocm ''-tags rocm''} ./...
  '';

However, this leads to other errors, specifically when building for gfx1010. It seems that some libraries are still not built correctly. I have played around with the build script but have had no luck so far.

volfyd · 2024-05-13T18:04:46Z

I haven't fully tested, but I was able to get ollama to see my AMD card. It may just work?

Here's what I changed from your original PR from last week:

diff --git a/pkgs/tools/misc/ollama/default.nix b/pkgs/tools/misc/ollama/default.nix
index 1934ac80a..b0dc7a9da 100644
--- a/pkgs/tools/misc/ollama/default.nix
+++ b/pkgs/tools/misc/ollama/default.nix
@@ -24,7 +24,7 @@
 
 , config
   # one of `[ null false "rocm" "cuda" ]`
-, acceleration ? null
+, acceleration ? "rocm"
 }:
 
 let
@@ -104,6 +104,13 @@ let
   };
 
   runtimeLibs = lib.optionals enableRocm [
+    rocmPackages.clr
+    rocmPackages.hipblas
+    rocmPackages.rocblas
+    rocmPackages.rocsolver
+    rocmPackages.rocsparse
+    rocmPackages.rocm-device-libs
+    rocmClang
     rocmPackages.rocm-smi
   ] ++ lib.optionals enableCuda [
     linuxPackages.nvidia_x11
@@ -166,6 +173,7 @@ goBuild ((lib.optionalAttrs enableRocm {
   postPatch = ''
     # replace inaccurate version number with actual release version
     substituteInPlace version/version.go --replace-fail 0.0.0 '${version}'
+    substituteInPlace llm/generate/gen_linux.sh --replace-fail 'exit 1' ""
   '';
   preBuild = ''
     # disable uses of `git`, since nix removes the git directory

Obviously changing the default acceleration to rocm was just so I could run

nix build .#ollama

I don't know if there is a way to enable rocm when running that build command.
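One way this might be done without editing default.nix is sketched below. It assumes the package is instantiated through callPackage (so that .override is available) and that acceleration is still the argument name shown in the diff above. The file name rocm-ollama.nix is hypothetical.

```nix
# rocm-ollama.nix (hypothetical file name), saved in the root of the nixpkgs checkout
let
  # evaluate the local checkout rather than a channel
  pkgs = import ./. { };
in
# re-call the package function with a different `acceleration` argument
pkgs.ollama.override { acceleration = "rocm"; }
```

Building it with nix-build rocm-ollama.nix should then produce the usual result symlink while leaving the default in default.nix untouched.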

Also, the substituteInPlace should really be a patch (or even an upstream bug?)

Basically the problem is that the upstream change here

ollama/ollama#3218

made it so that we need the rocm libraries as runtime dependencies instead of build time, since they aren't fully pulled in until runtime. I think anyway, I haven't completely tested this.

Here it is working (at least to the point of seeing my AMD card):

$ HSA_OVERRIDE_GFX_VERSION="11.0.0" result/bin/ollama serve 
time=2024-05-13T13:58:50.843-04:00 level=INFO source=images.go:828 msg="total blobs: 0"
time=2024-05-13T13:58:50.843-04:00 level=INFO source=images.go:835 msg="total unused blobs removed: 0"
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.

[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:	export GIN_MODE=release
 - using code:	gin.SetMode(gin.ReleaseMode)

[GIN-debug] POST   /api/pull                 --> github.com/ollama/ollama/server.(*Server).PullModelHandler-fm (5 handlers)
[GIN-debug] POST   /api/generate             --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
[GIN-debug] POST   /api/chat                 --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers)
[GIN-debug] POST   /api/embeddings           --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
[GIN-debug] POST   /api/create               --> github.com/ollama/ollama/server.(*Server).CreateModelHandler-fm (5 handlers)
[GIN-debug] POST   /api/push                 --> github.com/ollama/ollama/server.(*Server).PushModelHandler-fm (5 handlers)
[GIN-debug] POST   /api/copy                 --> github.com/ollama/ollama/server.(*Server).CopyModelHandler-fm (5 handlers)
[GIN-debug] DELETE /api/delete               --> github.com/ollama/ollama/server.(*Server).DeleteModelHandler-fm (5 handlers)
[GIN-debug] POST   /api/show                 --> github.com/ollama/ollama/server.(*Server).ShowModelHandler-fm (5 handlers)
[GIN-debug] POST   /api/blobs/:digest        --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
[GIN-debug] HEAD   /api/blobs/:digest        --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
[GIN-debug] POST   /v1/chat/completions      --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
[GIN-debug] GET    /                         --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] GET    /api/tags                 --> github.com/ollama/ollama/server.(*Server).ListModelsHandler-fm (5 handlers)
[GIN-debug] GET    /api/version              --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
[GIN-debug] HEAD   /                         --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] HEAD   /api/tags                 --> github.com/ollama/ollama/server.(*Server).ListModelsHandler-fm (5 handlers)
[GIN-debug] HEAD   /api/version              --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
time=2024-05-13T13:58:50.843-04:00 level=INFO source=routes.go:1071 msg="Listening on 127.0.0.1:11434 (version 0.1.33)"
time=2024-05-13T13:58:50.843-04:00 level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama1030226676/runners
time=2024-05-13T13:58:50.987-04:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm]"
time=2024-05-13T13:58:50.987-04:00 level=INFO source=gpu.go:96 msg="Detecting GPUs"
time=2024-05-13T13:58:50.988-04:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-05-13T13:58:50.988-04:00 level=WARN source=amd_linux.go:49 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-05-13T13:58:50.988-04:00 level=INFO source=amd_linux.go:217 msg="amdgpu memory" gpu=0 total="20464.0 MiB"
time=2024-05-13T13:58:50.988-04:00 level=INFO source=amd_linux.go:218 msg="amdgpu memory" gpu=0 available="20464.0 MiB"

abysssol · 2024-05-14T10:56:57Z

@volfyd I tested these changes with 0.1.37, but it doesn't appear to work for me; I'm not sure if I'm doing something wrong. My changes are in this branch. I did test with rocmClang included as well like your diff, that didn't work either.

Ollama does run, but doesn't detect the libraries. This is the relevant section of ollama's debug logs. I also confirmed that only the cpu is used when actually running a model.

time=2024-05-14T06:38:09.748-04:00 level=INFO source=routes.go:1052 msg="Listening on 127.0.0.1:11434 (version 0.1.37)"
time=2024-05-14T06:38:09.753-04:00 level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama902218830/runners
time=2024-05-14T06:38:09.869-04:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm]"
time=2024-05-14T06:38:09.869-04:00 level=WARN source=amd_linux.go:48 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-05-14T06:38:09.870-04:00 level=WARN source=amd_linux.go:346 msg="amdgpu detected, but no compatible rocm library found.  Either install rocm v6, or follow manual install instructions at https://github.com/ollama/ollama/blob/main/docs/linux.md#manual-install"
time=2024-05-14T06:38:09.870-04:00 level=WARN source=amd_linux.go:278 msg="unable to verify rocm library, will use cpu" error="no suitable rocm found, falling back to CPU"
time=2024-05-14T06:38:09.870-04:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="62.0 GiB" available="2.6 GiB"

volfyd · 2024-05-14T17:47:15Z

@abysssol Maybe you need HSA_OVERRIDE_GFX_VERSION?

I tried 0.1.37 and it is working for me.

diff --git a/pkgs/tools/misc/ollama/default.nix b/pkgs/tools/misc/ollama/default.nix
index fdda6ba3f1e8..73f9b08c2ca0 100644
--- a/pkgs/tools/misc/ollama/default.nix
+++ b/pkgs/tools/misc/ollama/default.nix
@@ -24,28 +24,28 @@
 
 , config
   # one of `[ null false "rocm" "cuda" ]`
-, acceleration ? null
+, acceleration ? "rocm"
 }:
 
 let
   pname = "ollama";
   # don't forget to invalidate all hashes each update
-  version = "0.1.31";
+  version = "0.1.37";
 
   src = fetchFromGitHub {
     owner = "jmorganca";
     repo = "ollama";
     rev = "v${version}";
-    hash = "sha256-Ip1zrhgGpeYo2zsN206/x+tcG/bmPJAq4zGatqsucaw=";
+    hash = "sha256-ZorOrIOWjXltxqOXNkFJ9190EXTAn+YcjZZhDBJsLqc=";
     fetchSubmodules = true;
   };
-  vendorHash = "sha256-Lj7CBvS51RqF63c01cOCgY7BCQeCKGu794qzb/S80C0=";
+  vendorHash = "sha256-zOQGhNcGNlQppTqZdPfx+y4fUrxH0NOUl38FN8J6ffE=";
   # ollama's patches of llama.cpp's example server
   # `ollama/llm/generate/gen_common.sh` -> "apply temporary patches until fix is upstream"
   # each update, these patches should be synchronized with the contents of `ollama/llm/patches/`
   llamacppPatches = [
     (preparePatch "03-load_exception.diff" "sha256-1DfNahFYYxqlx4E4pwMKQpL+XR0bibYnDFGt6dCL4TM=")
-    (preparePatch "04-locale.diff" "sha256-r5nHiP6yN/rQObRu2FZIPBKpKP9yByyZ6sSI2SKj6Do=")
+    #(preparePatch "04-locale.diff" "sha256-r5nHiP6yN/rQObRu2FZIPBKpKP9yByyZ6sSI2SKj6Do=")
   ];
 
   preparePatch = patch: hash: fetchpatch {
@@ -103,6 +103,13 @@ let
   };
 
   runtimeLibs = lib.optionals enableRocm [
+    rocmPackages.clr
+    rocmPackages.hipblas
+    rocmPackages.rocblas
+    rocmPackages.rocsolver
+    rocmPackages.rocsparse
+    rocmPackages.rocm-device-libs
+    rocmClang
     rocmPackages.rocm-smi
   ] ++ lib.optionals enableCuda [
     linuxPackages.nvidia_x11
@@ -164,9 +171,10 @@ goBuild ((lib.optionalAttrs enableRocm {
   ] ++ llamacppPatches;
   postPatch = ''
     # replace a hardcoded use of `g++` with `$CXX` so clang can be used on darwin
-    substituteInPlace llm/generate/gen_common.sh --replace-fail 'g++' '$CXX'
+    #substituteInPlace llm/generate/gen_common.sh --replace-fail 'g++' '$CXX'
     # replace inaccurate version number with actual release version
     substituteInPlace version/version.go --replace-fail 0.0.0 '${version}'
+    substituteInPlace llm/generate/gen_linux.sh --replace-fail 'exit 1' ""
   '';
   preBuild = ''
     # disable uses of `git`, since nix removes the git directory

So I run the daemon here:

HSA_OVERRIDE_GFX_VERSION="11.0.0" ~/..code/nixpkgs/result/bin/ollama serve

and then in another window

$ ~/..code/nixpkgs/result/bin/ollama --version
ollama version is 0.1.37

$ ~/..code/nixpkgs/result/bin/ollama run llama3
pulling manifest 
pulling 00e1317cbf74... 100% ▕███████████████████████████████████████████████████████▏ 4.7 GB                         
pulling 4fa551d4f938... 100% ▕███████████████████████████████████████████████████████▏  12 KB                         
pulling 8ab4849b038c... 100% ▕███████████████████████████████████████████████████████▏  254 B                         
pulling 577073ffcc6c... 100% ▕███████████████████████████████████████████████████████▏  110 B                         
pulling ad1518640c43... 100% ▕███████████████████████████████████████████████████████▏  483 B                         
verifying sha256 digest 
writing manifest 
removing any unused layers 
success 
>>> Hello.
Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?

>>>

and then checking back with the logs in the first window:

time=2024-05-14T13:38:58.973-04:00 level=INFO source=amd_linux.go:304 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=11.0.0

...

ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 7900 XT, compute capability 11.0, VMM: no
llm_load_tensors: ggml ctx size =    0.30 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors:      ROCm0 buffer size =  4155.99 MiB
llm_load_tensors:        CPU buffer size =   281.81 MiB

volfyd · 2024-05-14T20:55:01Z

I made an override. It's really messy because overriding Go builds is really messy.

_: super: {
  ollama = let
    version = "0.1.37";
    hash = "sha256-ZorOrIOWjXltxqOXNkFJ9190EXTAn+YcjZZhDBJsLqc=";
    vendorHash = "sha256-zOQGhNcGNlQppTqZdPfx+y4fUrxH0NOUl38FN8J6ffE=";
    src = super.fetchFromGitHub {
      owner = "jmorganca";
      repo = "ollama";
      rev = "v${version}";
      inherit hash;
      fetchSubmodules = true;
    };
  in
    (super.ollama.overrideAttrs (old: rec {
      inherit version src;
      patches = let
        preparePatch = patch: hash:
          super.fetchpatch {
            url = "file://${src}/llm/patches/${patch}";
            inherit hash;
            stripLen = 1;
            extraPrefix = "llm/llama.cpp/";
          };
      in
        [(builtins.head old.patches)]
        ++ [
          (preparePatch "02-clip-log.diff" "sha256-rMWbl3QgrPlhisTeHwD7EnGRJyOhLB4UeS7rqa0tdXM=")
          (preparePatch "03-load_exception.diff" "sha256-1DfNahFYYxqlx4E4pwMKQpL+XR0bibYnDFGt6dCL4TM=")
          (preparePatch "04-metal.diff" "sha256-Ne8J9R8NndUosSK0qoMvFfKNwqV5xhhce1nSoYrZo7Y=")
          (preparePatch "05-clip-fix.diff" "sha256-rCc3xNuJR11OkyiXuau8y46hb+KYk40ZqH1Llq+lqWc=")
        ];

      postPatch = ''
        # replace inaccurate version number with actual release version
        substituteInPlace version/version.go --replace-fail 0.0.0 '${version}'
        substituteInPlace llm/generate/gen_linux.sh --replace-fail 'exit 1' ""
      '';
    }))
    .override {
      buildGo122Module = args:
        super.buildGo122Module (args
          // {
            inherit version src vendorHash;
            postFixup = let
              runtimeLibs = [
                super.rocmPackages.clr
                super.rocmPackages.hipblas
                super.rocmPackages.rocblas
                super.rocmPackages.rocsolver
                super.rocmPackages.rocsparse
                super.rocmPackages.rocm-device-libs
                (super.linkFarm
                  "rocm-clang"
                  {
                    llvm = super.rocmPackages.llvm.clang;
                  })
                super.rocmPackages.rocm-smi
              ];
              rocmPath = super.buildEnv {
                name = "rocm-path";
                paths = runtimeLibs;
              };
            in ''
              # the app doesn't appear functional at the moment, so hide it
              mv "$out/bin/app" "$out/bin/.ollama-app"
              # expose runtime libraries necessary to use the gpu
              mv "$out/bin/ollama" "$out/bin/.ollama-unwrapped"
              makeWrapper "$out/bin/.ollama-unwrapped" "$out/bin/ollama" --set-default HIP_PATH '${rocmPath}' \
                --suffix LD_LIBRARY_PATH : '/run/opengl-driver/lib:${super.lib.makeLibraryPath runtimeLibs}'
            '';
          });
    };
};

abysssol · 2024-05-15T03:53:03Z

@volfyd I tried setting HSA_OVERRIDE_GFX_VERSION to 10.3.0, but that seemed to have no effect. For reference, I have an rx 6950 xt, and it's worked correctly with previous versions of ollama.

Have you tested the changes on my branch to see if they work for you? If it doesn't work, then I must have just set something up wrong, but if it does work for you then there must be some strange hardware specific incompatibility.

volfyd · 2024-05-15T15:59:02Z

Your nixpkgs works for me if I use the HSA_OVERRIDE_GFX_VERSION="11.0.0" environment variable. (I omitted some output):

$ mkdir ~/..code/abysssol
$ cd ~/..code/abysssol
$ git clone --reference ~/..code/nixpkgs https://github.com/abysssol/nixpkgs.git
$ cd nixpkgs
$ git checkout ollama-update-0.1.37                                             
$ sed -i -e 's/acceleration ? null/acceleration ? "rocm"/' pkgs/tools/misc/ollama/default.nix 
$ nix build .#ollama
$ sudo systemctl stop ollama                                                                  
$ result/bin/ollama serve
...
time=2024-05-15T10:21:12.215-04:00 level=INFO source=amd_linux.go:311 msg="no compatible amdgpu devices detected"
^C
$ HSA_OVERRIDE_GFX_VERSION="11.0.0" result/bin/ollama serve
...
time=2024-05-15T10:21:48.861-04:00 level=INFO source=amd_linux.go:304 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=11.0.0
time=2024-05-15T10:21:48.861-04:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=rocm compute=gfx1100 driver=0.0 name=1002:744c total="20.0 GiB" available="20.0 GiB"
^C

So then I got curious, and I removed the change I proposed (add the libraries as runtime dependencies). Note that the diff is relative to your commit, so it is showing me removing the runtime deps.

$ vi pkgs/tools/misc/ollama/default.nix
$ git diff
diff --git a/pkgs/tools/misc/ollama/default.nix b/pkgs/tools/misc/ollama/default.nix
index dbe46f2fc4c6..69fe04da371c 100644
--- a/pkgs/tools/misc/ollama/default.nix
+++ b/pkgs/tools/misc/ollama/default.nix
@@ -24,7 +24,7 @@
 
 , config
   # one of `[ null false "rocm" "cuda" ]`
-, acceleration ? null
+, acceleration ? "rocm"
 }:
 
 let
@@ -107,9 +107,9 @@ let
   };
 
   runtimeLibs = lib.optionals enableRocm
-    (rocmLibs ++ [
+    [
       rocmPackages.rocm-smi
-    ])
+    ]
   ++ lib.optionals enableCuda [
     linuxPackages.nvidia_x11
   ];
$ nix build .#ollama
$ HSA_OVERRIDE_GFX_VERSION="11.0.0" result/bin/ollama serve
...
time=2024-05-15T10:47:09.055-04:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=rocm compute=gfx1100 driver=0.0 name=1002:744c total="20.0 GiB" available="20.0 GiB"
^C

And that still worked! So my theory may be incorrect. I don't know why this is working on my system and not yours. Perhaps I have something different in my NixOS config? I will try again, reverting even more of the change...

volfyd · 2024-05-15T16:14:48Z

OK, so here I am trying it with most of the "fixes" removed (except the build fix related to the exit 1).

I used git reset to peel off your commit, unstaged the bits I didn't want, and ran git restore to discard them. The diff below is therefore relative to plain nixpkgs.

$ git reset --soft HEAD~1
$ lazygit
$ git restore .
$ sed -i -e 's/acceleration ? null/acceleration ? "rocm"/' pkgs/tools/misc/ollama/default.nix 
$ git diff HEAD
diff --git a/pkgs/tools/misc/ollama/default.nix b/pkgs/tools/misc/ollama/default.nix
index fdda6ba3f1e8..2cd2792da1f0 100644
--- a/pkgs/tools/misc/ollama/default.nix
+++ b/pkgs/tools/misc/ollama/default.nix
@@ -24,28 +24,30 @@
 
 , config
   # one of `[ null false "rocm" "cuda" ]`
-, acceleration ? null
+, acceleration ? "rocm"
 }:
 
 let
   pname = "ollama";
   # don't forget to invalidate all hashes each update
-  version = "0.1.31";
+  version = "0.1.37";
 
   src = fetchFromGitHub {
     owner = "jmorganca";
     repo = "ollama";
     rev = "v${version}";
-    hash = "sha256-Ip1zrhgGpeYo2zsN206/x+tcG/bmPJAq4zGatqsucaw=";
+    hash = "sha256-ZorOrIOWjXltxqOXNkFJ9190EXTAn+YcjZZhDBJsLqc=";
     fetchSubmodules = true;
   };
-  vendorHash = "sha256-Lj7CBvS51RqF63c01cOCgY7BCQeCKGu794qzb/S80C0=";
+  vendorHash = "sha256-zOQGhNcGNlQppTqZdPfx+y4fUrxH0NOUl38FN8J6ffE=";
   # ollama's patches of llama.cpp's example server
   # `ollama/llm/generate/gen_common.sh` -> "apply temporary patches until fix is upstream"
   # each update, these patches should be synchronized with the contents of `ollama/llm/patches/`
   llamacppPatches = [
+    (preparePatch "02-clip-log.diff" "sha256-rMWbl3QgrPlhisTeHwD7EnGRJyOhLB4UeS7rqa0tdXM=")
     (preparePatch "03-load_exception.diff" "sha256-1DfNahFYYxqlx4E4pwMKQpL+XR0bibYnDFGt6dCL4TM=")
-    (preparePatch "04-locale.diff" "sha256-r5nHiP6yN/rQObRu2FZIPBKpKP9yByyZ6sSI2SKj6Do=")
+    (preparePatch "04-metal.diff" "sha256-Ne8J9R8NndUosSK0qoMvFfKNwqV5xhhce1nSoYrZo7Y=")
+    (preparePatch "05-clip-fix.diff" "sha256-rCc3xNuJR11OkyiXuau8y46hb+KYk40ZqH1Llq+lqWc=")
   ];
 
   preparePatch = patch: hash: fetchpatch {
@@ -161,10 +163,10 @@ goBuild ((lib.optionalAttrs enableRocm {
     # this also disables necessary patches contained in `ollama/llm/patches/`
     # those patches are added to `llamacppPatches`, and reapplied here in the patch phase
     ./disable-git.patch
+    # TODO: add reason
+    ./disable-lib-check.patch
   ] ++ llamacppPatches;
   postPatch = ''
-    # replace a hardcoded use of `g++` with `$CXX` so clang can be used on darwin
-    substituteInPlace llm/generate/gen_common.sh --replace-fail 'g++' '$CXX'
     # replace inaccurate version number with actual release version
     substituteInPlace version/version.go --replace-fail 0.0.0 '${version}'
   '';
diff --git a/pkgs/tools/misc/ollama/disable-lib-check.patch b/pkgs/tools/misc/ollama/disable-lib-check.patch
new file mode 100644
index 000000000000..8ce5fcb04e25
--- /dev/null
+++ b/pkgs/tools/misc/ollama/disable-lib-check.patch
@@ -0,0 +1,10 @@
+--- a/llm/generate/gen_linux.sh
++++ b/llm/generate/gen_linux.sh
+@@ -245,7 +245,6 @@
+     if [ $(cat "${BUILD_DIR}/bin/deps.txt" | wc -l ) -lt 8 ] ; then
+         cat "${BUILD_DIR}/bin/deps.txt"
+         echo "ERROR: deps file short"
+-        exit 1
+     fi
+     compress
+ fi
$ nix build .#ollama
$ HSA_OVERRIDE_GFX_VERSION="11.0.0" result/bin/ollama serve
...
time=2024-05-15T12:10:24.603-04:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=rocm compute=gfx1100 driver=0.0 name=1002:744c total="20.0 GiB" available="20.0 GiB"
^C

And you can see that it still works! I should probably take ollama out of my NixOS config and see if that makes a difference.

volfyd · 2024-05-15T18:56:21Z

So in my NixOS config, I have

hardware.opengl.extraPackages = with pkgs; [
  rocmPackages.clr.icd #following for GPU AI acceleration
  rocmPackages.rocm-smi
  rocmPackages.clr
  rocmPackages.hipblas
  rocmPackages.rocblas
  rocmPackages.rocsolver
  rocmPackages.rocm-comgr
  rocmPackages.rocm-runtime
  rocmPackages.rocsparse

  # unnecessary for OLLAMA ?
  amdvlk
  rocm-opencl-icd #gaming?
  rocm-opencl-runtime #gaming?
  libva #some hardware acceleration for stuff like OBS
  vaapiVdpau
  libvdpau-va-gl
];

If I remove that, I get

time=2024-05-15T14:48:47.092-04:00 level=WARN source=amd_linux.go:278 msg="unable to verify rocm library, will use cpu" error="no suitable rocm found, falling back to CPU"

Higher up @abysssol you said

> I did test what happens if I disable the check and run the compiled ollama binary anyway, but it just crashes almost instantly after running it.

I am not sure if "crash" means it exits with an error or without, but I have never seen it crash really. It either works if I run it with the environment variable and the necessary hardware.opengl.extraPackages or it falls back to CPU without crashing.

abysssol · 2024-05-16T05:03:04Z

> I am not sure if "crash" means it exits with an error or without, but I have never seen it crash really. It either works if I run it with the environment variable and the necessary hardware.opengl.extraPackages or it falls back to CPU without crashing.

I seem to remember testing it a few weeks ago and it crashing without any output/logs/error, somewhat like a segfault or linker error (from unavailable libs). However, my current testing has never had this outcome either, so I'm not sure if it's a false memory, or if I confused crashing as an outcome of a different experiment as being the result of removing the lib check, or if a different ollama version had different behavior in this regard. You can probably safely ignore that comment, as it seems to be irrelevant/outdated/false. My apologies for the confusion.

I tried adding the packages from your hardware.opengl.extraPackages into both the ollama package's runtimeLibs and into my hardware.opengl.extraPackages, and I tried this both with and without HSA_OVERRIDE_GFX_VERSION set. Unfortunately, I've still been unable to get ollama to use rocm. It seems that my ability to contribute to figuring this out will be greatly reduced as ollama seems simply unwilling to run rocm on my machine.

However, since this seems to work right for you it may work right for others, so it's probably worth putting these changes in whatever version ends up getting into unstable to at least allow correct functionality for some users.

Could you try removing everything from your hardware.opengl.extraPackages and adding that into the ollama package's runtimeLibs to see if that correctly exposes those libraries to ollama?

   runtimeLibs = lib.optionals enableRocm
     (rocmLibs ++ [
       rocmPackages.rocm-smi
+      rocmPackages.clr.icd
+      rocmPackages.rocm-comgr
+      rocmPackages.rocm-runtime
+      pkgs.rocm-opencl-icd
+      pkgs.rocm-opencl-runtime
+      pkgs.libva
+      pkgs.vaapiVdpau
+      pkgs.libvdpau-va-gl
+      pkgs.amdvlk
     ])
   ++ lib.optionals enableCuda [
     linuxPackages.nvidia_x11
   ];

If that works, then try removing libraries to find the minimum necessary libraries for ollama to function correctly. That way someone can use ollama without having to first add anything into their hardware.opengl.extraPackages, as the package will be self-contained.

Thank you for all your help with this.

volfyd · 2024-05-16T19:28:21Z

Here are the packages I actually need in my NixOS config:

  # With these packages in NixOS config, ollama 0.1.37 works.
  hardware.opengl.extraPackages = with pkgs; [
    rocmPackages.hipblas
    rocmPackages.rocblas
  ];

Putting them into the nixpkgs definition instead doesn't work

  # This incorrectly falls back to CPU
  runtimeLibs = lib.optionals enableRocm [
    rocmPackages.rocm-smi
    rocmPackages.hipblas
    rocmPackages.rocblas
  ] ++ lib.optionals enableCuda [
    linuxPackages.nvidia_x11
  ];

nix build .#ollama
HSA_OVERRIDE_GFX_VERSION="11.0.0" result/bin/ollama serve
...
time=2024-05-16T15:23:57.751-04:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 rocm cpu cpu_avx]"
time=2024-05-16T15:23:57.751-04:00 level=WARN source=amd_linux.go:48 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-05-16T15:23:57.752-04:00 level=WARN source=amd_linux.go:346 msg="amdgpu detected, but no compatible rocm library found.  Either install rocm v6, or follow manual install instructions at https://github.com/ollama/ollama/blob/main/docs/linux.md#manual-install"
time=2024-05-16T15:23:57.752-04:00 level=WARN source=amd_linux.go:278 msg="unable to verify rocm library, will use cpu" error="no suitable rocm found, falling back to CPU"

I don't have an explanation

volfyd · 2024-05-16T21:33:47Z

I have an explanation!

ollama looks for hipblas and rocblas and requires them to be in the same directory. They can be in the same directory if they are installed system-wide by NixOS (as with hardware.opengl.extraPackages above), but as separate Nix packages each will be in its own directory.

https://github.com/ollama/ollama/blob/5bece945090b94a3f1eab03be48fb6f6b25e1e79/gpu/amd_linux.go#L33

	ROCmLibGlobs          = []string{"libhipblas.so.2*", "rocblas"} // TODO - probably include more coverage of files here...

https://github.com/ollama/ollama/blob/main/gpu/amd_common.go#L14

// Determine if the given ROCm lib directory is usable by checking for existence of some glob patterns
func rocmLibUsable(libDir string) bool {
	slog.Debug("evaluating potential rocm lib dir " + libDir)
	for _, g := range ROCmLibGlobs {
		res, _ := filepath.Glob(filepath.Join(libDir, g))
		if len(res) == 0 {
			return false
		}
	}
	return true
}
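A rough sketch of how the same-directory requirement might be met from inside the package follows. It assumes symlinkJoin is used to merge the two outputs into one tree; the name rocm-blas-merged is hypothetical. Whether ollama then finds this directory depends on its search order, which this thread does not establish.

```nix
{ symlinkJoin, rocmPackages }:

# Merge both packages into a single tree so that one lib/ directory
# should hold libhipblas.so.2* as well as the rocblas/ directory that
# the ROCmLibGlobs patterns look for.
symlinkJoin {
  name = "rocm-blas-merged"; # hypothetical name
  paths = [
    rocmPackages.hipblas
    rocmPackages.rocblas
  ];
}
```

The lib directory of the merged tree could then be exposed to the wrapped binary in place of the per-package paths, which is an untested assumption here rather than a confirmed fix.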

volfyd · 2024-05-17T21:08:48Z

One thing to note. If I make the check (the check for hipblas and rocblas in the same directory) less strict, to only require that rocblas is found, then ollama appears to work with the graphics card.

diff --git a/pkgs/tools/misc/ollama/default.nix b/pkgs/tools/misc/ollama/default.nix
index 861f0901ea88..66aef69cd10e 100644
--- a/pkgs/tools/misc/ollama/default.nix
+++ b/pkgs/tools/misc/ollama/default.nix
@@ -24,7 +24,7 @@
 
 , config
   # one of `[ null false "rocm" "cuda" ]`
-, acceleration ? null
+, acceleration ? "rocm"
 }:
 
 let
@@ -106,24 +106,8 @@ let
 
   runtimeLibs = lib.optionals enableRocm [
     rocmPackages.rocm-smi
-    rocmPackages.clr.icd #following for GPU AI acceleration
-    rocmPackages.clr
     rocmPackages.hipblas
     rocmPackages.rocblas
-
-    rocmPackages.rocsolver
-    rocmPackages.rocm-comgr
-    rocmPackages.rocm-runtime
-    rocmPackages.rocsparse
-
-    # unnecessary for OLLAMA ?
-    pkgs.amdvlk
-    pkgs.rocm-opencl-icd #gaming?
-    pkgs.rocm-opencl-runtime #gaming?
-    pkgs.libva #some hardware acceleration for stuff like OBS
-    pkgs.vaapiVdpau
-    pkgs.libvdpau-va-gl
-
   ] ++ lib.optionals enableCuda [
     linuxPackages.nvidia_x11
   ];
@@ -187,6 +171,7 @@ goBuild ((lib.optionalAttrs enableRocm {
   postPatch = ''
     # replace inaccurate version number with actual release version
     substituteInPlace version/version.go --replace-fail 0.0.0 '${version}'
+    substituteInPlace gpu/amd_linux.go --replace-fail '"libhipblas.so.2*", "rocblas"' '"rocblas"'
   '';
   preBuild = ''
     # disable uses of `git`, since nix removes the git directory

ollama does not appear to explicitly load hipblas at runtime (but it does use it at build time) so I think the check in the code is just overly strict.

abysssol · 2024-05-18T07:34:02Z

I've finally managed to figure out how to get the rocm build to work for me, all thanks to you @volfyd!
I couldn't have done it without your help; I was stuck, and had entirely given up on getting ollama to work right with rocm.

I've created PR #312608 for further testing and discussion. If all goes well (rocm works correctly for others and no further issues are found), I want to merge it as soon as the next stable NixOS version is released.

NixOSInfra added the "12. first-time contribution" label May 5, 2024

ofborg bot requested review from dit7ya, elohmeier and abysssol May 5, 2024 18:18

ofborg bot added the "10.rebuild-darwin: 1-10", "10.rebuild-darwin: 1", "10.rebuild-linux: 1-10", and "10.rebuild-linux: 1" labels May 5, 2024

drupol requested changes May 5, 2024

ollama: 0.1.31 -> 0.1.33

4679652

https://github.com/ollama/ollama/releases/tag/v0.1.33

nsbuitrago force-pushed the update-ollama branch from 5edad74 to 4679652 Compare May 5, 2024 21:29

abysssol requested changes May 6, 2024

drupol mentioned this pull request May 6, 2024

ollama: 0.1.31 -> 0.1.33 #309647

Closed

13 tasks

abysssol mentioned this pull request May 18, 2024

ollama: 0.1.31 -> 0.1.38 #312608

Merged

13 tasks

nsbuitrago closed this May 18, 2024
